Not…Until across European Languages: A Parallel Corpus Study
Abstract
We present a parallel corpus study on the expression of the temporal construction ‘not…until’ in a sample of European languages. We use data from Europarl and create semantic maps by multidimensional scaling, in order to analyze cross-linguistic and language-internal variation. This paper builds on formal semantic and typological work, extending it by including conditional constructions, as well as connectives of the type as long as. In an investigation of 7 languages, we find that (i) languages use many more different constructions to convey this meaning than was expected from the literature; (ii) the combination of polarity marking (negation/assertion) strongly correlates with the type of connective. We corroborate our results in a larger sample of 21 European languages. An analysis of the clusters and dimensions of the semantic maps based on the enlarged dataset shows that connectives are not randomly distributed across the semantic space of the ‘not…until’-domain.
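A semantic map built by multidimensional scaling starts from a dissimilarity matrix over the aligned contexts. The following is a minimal sketch of that first step only, not the study's actual pipeline. It assumes each context is coded by the construction each language uses there. The context names, language codes, construction labels, and the helper functions `hamming_distance` and `distance_matrix` are all hypothetical and chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy data: for each aligned context, the construction label
# chosen in each language. None of these values come from the study.
contexts = {
    "c1": {"en": "not_until", "de": "erst", "nl": "pas"},
    "c2": {"en": "not_until", "de": "erst", "nl": "niet_tot"},
    "c3": {"en": "only_when", "de": "nur_wenn", "nl": "niet_tot"},
}


def hamming_distance(a, b):
    """Share of shared languages in which two contexts use different labels."""
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("contexts share no language")
    return sum(a[lang] != b[lang] for lang in shared) / len(shared)


def distance_matrix(data):
    """Return sorted context names and a symmetric nested-dict distance matrix."""
    names = sorted(data)
    matrix = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = hamming_distance(data[n], data[m])
        matrix[n][m] = matrix[m][n] = d
    return names, matrix


names, dist = distance_matrix(contexts)
for n in names:
    print(n, [round(dist[n][m], 2) for m in names])
```

A matrix of this kind could then be handed to any MDS routine that accepts precomputed dissimilarities, which would place contexts with similar cross-linguistic coding close together on the map.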
Similar resources
Computational and Linguistic Issues in Designing a Syntactically Annotated Parallel Corpus of Indo-European Languages
This paper reports on the development of the PROIEL parallel corpus of New Testament texts, which contains the Greek original of the New Testament and its earliest Indo-European translations into Latin, Gothic, Old Church Slavic and Classical Armenian. A web application has been constructed specifically for the purpose of annotating the texts at multiple levels: morphology, syntax, alignment at...
A Parallel Corpus for Evaluating Machine Translation between Arabic and European Languages
We present Arab-Acquis, a large publicly available dataset for evaluating machine translation between 22 European languages and Arabic. Arab-Acquis consists of over 12,000 sentences from the JRC-Acquis (Acquis Communautaire) corpus translated twice by professional translators, once from English and once from French, totaling over 600,000 words. The corpus follows previous data splits in the ...
Alignment Across Oriental and Indo-European Languages
The linguistic characteristics of Oriental languages and Indo-European languages are very different. A purely length-based algorithm cannot achieve high performance when aligning such texts. This paper investigates the effectiveness of a critical part-of-speech (POS) criterion for alignment under different search strategies and with texts of different registers. Two metrics, recall and precis...
A massively parallel corpus: the Bible in 100 languages
We describe the creation of a massively parallel corpus based on 100 translations of the Bible. We discuss some of the difficulties in acquiring and processing the raw material, as well as the potential of the Bible as a corpus for natural language processing. Finally, we present a statistical analysis of the corpora collected and a detailed comparison between the English translation and other En...
Journal
Journal title: Languages
Year: 2022
ISSN: 2226-471X
DOI: https://doi.org/10.3390/languages7010056